home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
System Booster
/
System Booster.iso
/
Archives
/
ForCLI
/
AGrep204.lha
/
agrep.chronicle
< prev
next >
Wrap
Text File
|
1992-03-16
|
3KB
|
65 lines
This chronicle briefly describes the progress of agrep.
Feb/91: The approximate pattern matching algorithm called 'bitap'
(bit-parallel approximate pattern matching) is designed.
The algorithm is a generalization of Baeza-Yates' "shift-or"
algorithm for exact matching.
Mar/91: Many extensions of the algorithm 'bitap' are found, especially
for approximate regular expression pattern matching. Preliminary
implementation of the algorithm showed a strong promise for
a general-purpose fast approximate pattern-matching tool.
Apr/91: Approximate regular expression pattern matching was implemented.
The result is even better than expected.
The design of the software tool is pinned down.
(For example, record oriented, multi-pattern, AND/OR logic queries.)
A partition technique for approximate pattern matching is used.
May/91: The prototype of "agrep" is completed.
A lot of debugging/optimization in this month.
Jun/91: The first version of agrep is released.
agrep 1.0 was announced and made available by anonymous ftp
from cs.arizona.edu.
Jul/91: A sub-linear expected-time algorithm, called "amonkey" for
approximate pattern matching (for simple pattern) is designed.
The algorithm has the same time complexity as that of
Chang&Lawler but is much much faster in practice.
The algorithm is based on a variation of Boyer-Moore technique,
which we call "block-shifting."
A sub-linear expected-time algorithm, called "mgrep" for
matching a set of patterns is designed based on the "block-shifting"
technique with a hashing technique.
Aug/91: "amonkey" is implemented and incorporated into agrep.
It is very fast for long patterns like DNA patterns.
(But roughly the same for matching English words as the bitap
algorithm using the partition technique.)
Prototype of "mgrep" is implemented.
Sep/91: "mgrep" is incorporated into agrep to support the -f option.
An algorithm for approximate pattern matching that combines the
'partition' technique with the sub-linear expected-time algorithm
for multi-patterns is designed.
Implementation shows it to be the fastest for ASCII text (and pattern).
Boyer-moore technique for exact matching is incorporated.
Nov/91: The final paper of "agrep" that is to appear in USENIX
conference (Jan 1992) is finished.
Jan/92: Some new options are added, such as find best matches (-B),
and file outputs (-G).
The man pages are revised.
agrep version 2.0 is released.
Fixed the following bugs and change the version to be 2.01.
1. -G option doesn't work correctly.
2. multiple definition of some global variables.
3. -# with -w forced the first character of the pattern to be matched
Mar/92: Fixed the following bugs and change the version to be 2.02.
1. agrep sometimes misses some matches for pipeline input.
2. the word delimiter was not defined consistantly.